 language pattern


Learning Distinct and Representative Modes for Image Captioning

Neural Information Processing Systems

While mode collapse is typically a side effect for generative modeling, it is somewhat "welcomed" in SoTA image captioning models as it usually facilitates a higher evaluation performance on reference-based metrics like CIDEr, BLEU and SPICE.


Pun Unintended: LLMs and the Illusion of Humor Understanding

arXiv.org Artificial Intelligence

Puns are a form of humorous wordplay that exploits polysemy and phonetic similarity. While LLMs have shown promise in detecting puns, we show in this paper that their understanding often remains shallow, lacking the nuanced grasp typical of human interpretation. By systematically analyzing and reformulating existing pun benchmarks, we demonstrate how subtle changes in puns are sufficient to mislead LLMs. Our contributions include comprehensive and nuanced pun detection benchmarks, human evaluation across recent LLMs, and an analysis of the robustness challenges these models face in processing puns.



Alzheimer's Dementia Detection Using Perplexity from Paired Large Language Models

arXiv.org Artificial Intelligence

Alzheimer's dementia (AD) is a neurodegenerative disorder with cognitive decline that commonly impacts language ability. This work extends the paired perplexity approach to detecting AD by using a recent large language model (LLM), the instruction-following version of Mistral-7B. We improve accuracy by an average of 3.33% over the best current paired perplexity method and by 6.35% over the top-ranked method from the ADReSS 2020 challenge benchmark. Our further analysis demonstrates that the proposed approach can effectively detect AD with a clear and interpretable decision boundary in contrast to other methods that suffer from opaque decision-making processes. Finally, by prompting the fine-tuned LLMs and comparing the model-generated responses to human responses, we illustrate that the LLMs have learned the special language patterns of AD speakers, which opens up possibilities for novel methods of model interpretation and data augmentation.
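
The paired perplexity idea described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' method: two add-one-smoothed unigram models stand in for the paired fine-tuned LLMs, the transcripts are invented toy sentences, and the names `train_unigram`, `perplexity`, `classify`, and the zero threshold are all assumptions made for this example.

```python
import math
from collections import Counter


def train_unigram(texts):
    """Count word frequencies. A toy stand-in for a fine-tuned LLM."""
    counts = Counter(w for t in texts for w in t.lower().split())
    return counts, sum(counts.values())


def perplexity(model, text, vocab_size):
    """Perplexity of `text` under an add-one-smoothed unigram model."""
    counts, total = model
    words = text.lower().split()
    if not words:
        raise ValueError("text must contain at least one word")
    log_prob = sum(
        math.log((counts[w] + 1) / (total + vocab_size)) for w in words
    )
    return math.exp(-log_prob / len(words))


def classify(text, control_model, ad_model, vocab_size, threshold=0.0):
    """Label a transcript by the log-perplexity difference of the two models.

    A positive difference means the AD-trained model finds the text less
    surprising than the control-trained model does.
    """
    diff = math.log(perplexity(control_model, text, vocab_size)) - math.log(
        perplexity(ad_model, text, vocab_size)
    )
    return "AD" if diff > threshold else "control"


# Invented toy transcripts (hypothetical, for illustration only).
control_texts = [
    "the boy is taking a cookie from the jar",
    "the mother is washing dishes at the sink",
]
ad_texts = [
    "the the boy uh the thing there",
    "she is uh doing the the thing",
]

control_model = train_unigram(control_texts)
ad_model = train_unigram(ad_texts)

# Shared vocabulary size, plus one slot for unseen words.
vocab = {w for t in control_texts + ad_texts for w in t.lower().split()}
vocab_size = len(vocab) + 1

print(classify("uh the thing the thing", control_model, ad_model, vocab_size))
print(classify("the mother is washing dishes", control_model, ad_model, vocab_size))
```

Because the decision reduces to a single scalar compared against a threshold, the boundary is easy to inspect, which is in the spirit of the interpretability claim in the abstract.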


ViGiL3D: A Linguistically Diverse Dataset for 3D Visual Grounding

arXiv.org Artificial Intelligence

3D visual grounding (3DVG) involves localizing entities in a 3D scene referred to by natural language text. Such models are useful for embodied AI and scene retrieval applications, which involve searching for objects or patterns using natural language descriptions. While recent works have focused on LLM-based scaling of 3DVG datasets, these datasets do not capture the full range of potential prompts which could be specified in the English language. To ensure that we are scaling up and testing against a useful and representative set of prompts, we propose a framework for linguistically analyzing 3DVG prompts and introduce Visual Grounding with Diverse Language in 3D (ViGiL3D), a diagnostic dataset for evaluating visual grounding methods against a diverse set of language patterns. We evaluate existing open-vocabulary 3DVG methods to demonstrate that these methods are not yet proficient in understanding and identifying the targets of more challenging, out-of-distribution prompts, toward real-world applications.


TextAge: A Curated and Diverse Text Dataset for Age Classification

arXiv.org Artificial Intelligence

Age-related language patterns play a crucial role in understanding linguistic differences and developing age-appropriate communication strategies. However, the lack of comprehensive and diverse datasets has hindered the progress of research in this area. To address this issue, we present TextAge, a curated text dataset that maps sentences to the age and age group of the producer, as well as an underage (under 13) label. TextAge covers a wide range of ages and includes both spoken and written data from various sources such as CHILDES, Meta, Poki Poems-by-kids, JUSThink, and the TV show "Survivor." The dataset undergoes extensive cleaning and preprocessing to ensure data quality and consistency. We demonstrate the utility of TextAge through two applications: Underage Detection and Generational Classification. For Underage Detection, we train a Naive Bayes classifier, fine-tuned RoBERTa, and XLNet models to differentiate between the language patterns of minors and those of young adults and older. For Generational Classification, the models classify language patterns into different age groups (kids, teens, twenties, etc.). The models excel at classifying the "kids" group but struggle with older age groups, particularly "fifties," "sixties," and "seventies," likely due to limited data samples and less pronounced linguistic differences. TextAge offers a valuable resource for studying age-related language patterns and developing age-sensitive language models. The dataset's diverse composition and the promising results of the classification tasks highlight its potential for various applications, such as content moderation, targeted advertising, and age-appropriate communication. Future work aims to expand the dataset further and explore advanced modeling techniques to improve performance on older age groups.


Exploring Gender Biases in Language Patterns of Human-Conversational Agent Conversations

arXiv.org Artificial Intelligence

With the rise of human-machine communication, machines are increasingly designed with humanlike characteristics, such as gender, which can inadvertently trigger cognitive biases. Many conversational agents (CAs), such as voice assistants and chatbots, default to female personas, leading to concerns about perpetuating gender stereotypes and inequality. Critiques have emerged regarding the potential objectification of females and reinforcement of gender stereotypes by these technologies. This research, situated in conversational AI design, aims to delve deeper into the impacts of gender biases in human-CA interactions. From a behavioral and communication research standpoint, this program focuses not only on perceptions but also on the linguistic styles of users when interacting with CAs, an aspect that previous research has rarely explored. It aims to understand how pre-existing gender biases might be triggered by CAs' gender designs. It further investigates how CAs' gender designs may reinforce gender biases and extend them to human-human communication. The findings aim to inform ethical design of conversational agents, addressing whether gender assignment in CAs is appropriate and how to promote gender equality in design.


Coherent Wave Dynamics and Language Generation of a Generative Pre-trained Transformer

arXiv.org Artificial Intelligence

Large Language Models (LLMs), such as the Generative Pretrained Transformer (GPT), have achieved tremendous success in various language tasks, but their emergent abilities have also raised many questions, concerns, and challenges that need to be addressed. To gain a better understanding of the models' inner mechanisms, we analyze the hidden state and channel wave dynamics in a small GPT, focusing on the coherence of wave patterns in terms of cross-channel correlation and individual auto-correlation. Our findings suggest that wave dynamics offer consistent and repeatable intrinsic oscillation modes, along with context-aware plasticity and expressiveness in language generation. By analyzing wave patterns, coherence, and clustering, we provide a systematic way to identify and interpret the functionality of the hidden state channels, paving the way to understand and control higher-level language pattern formation. In addition, we investigate the Poisson statistics of spelling errors in text sequence generation across various levels of model training and observe a phase-transition-like process. As coherence builds up, there is a competition between the generation of correct and misspelled words. However, once the model is adequately trained and significant coherence has emerged, the coherent process becomes strong enough to effectively suppress spelling errors, preventing the cascade amplification of defects. The distribution of correct spellings transitions from Poissonian to Sub-Poissonian, while the distribution of misspellings shows the opposite trend. By leveraging concepts and techniques from quantum physics, we gain novel insights into the dynamics of the small GPT. This approach can be extended to larger language models that exhibit more complex coherent language patterns, opening up opportunities to interpret their emergent capabilities and develop more specialized models.
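
The Poissonian versus Sub-Poissonian distinction mentioned in the abstract above is commonly quantified with the Fano factor (variance divided by mean of the counts): a value near 1 is Poisson-like, below 1 is sub-Poissonian, and above 1 is super-Poissonian. The sketch below illustrates that standard statistic only. The abstract does not say which measure the authors used, and the function names, the tolerance `tol`, and the example counts are assumptions for illustration.

```python
from statistics import mean, pvariance


def fano_factor(counts):
    """Fano factor: population variance of the counts divided by their mean."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean count is zero; Fano factor is undefined")
    return pvariance(counts) / m


def classify_statistics(counts, tol=0.1):
    """Label count data relative to the Poisson reference value of 1.

    `tol` is an arbitrary illustrative tolerance, not a statistical test.
    """
    f = fano_factor(counts)
    if abs(f - 1.0) <= tol:
        return "Poissonian"
    return "sub-Poissonian" if f < 1.0 else "super-Poissonian"


# Hypothetical per-window counts of some event (e.g. words of one kind).
regular = [4, 4, 4, 4]   # perfectly regular counts, variance 0
bursty = [0, 8, 0, 8]    # clustered counts, variance 16

print(classify_statistics(regular))
print(classify_statistics(bursty))
```

On real data, the counts would come from fixed-size windows of generated text, and a proper comparison would account for sampling noise rather than rely on a fixed tolerance.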


FVQA 2.0: Introducing Adversarial Samples into Fact-based Visual Question Answering

arXiv.org Artificial Intelligence

The widely used Fact-based Visual Question Answering (FVQA) dataset contains visually-grounded questions that require information retrieval using common sense knowledge graphs to answer. It has been observed that the original dataset is highly imbalanced and concentrated on a small portion of its associated knowledge graph. We introduce FVQA 2.0 which contains adversarial variants of test questions to address this imbalance. We show that systems trained with the original FVQA train sets can be vulnerable to adversarial samples and we demonstrate an augmentation scheme to reduce this vulnerability without human annotations.


Understanding Chatbots part 2 (Artificial Intelligence)

#artificialintelligence

Abstract: In this paper, we propose a natural language processing architecture that uses a single model to handle tasks that previously required two. With this single model, we analyze the language patterns and conversational context of Alzheimer's patients and derive answers in two forms: patient classification and chatbot responses. If chatbots identify a patient's language characteristics in daily life, doctors can plan more precise diagnosis and treatment at an early stage. The proposed model is used to develop chatbots that replace questionnaires that required experts. The model performs two natural language processing tasks.